Skip to content

Conversation

@lohanidamodar
Copy link
Contributor

@lohanidamodar lohanidamodar commented Dec 9, 2025

Usage

  • Automatically construct ID using md5 hash

  • Automatically aggregate metric in DB library using upsertDocumentWithIncrease and in ClickHouse using summingMergeTree engine and final select to make sure the aggregation happens

  • Database adapter (utopia-php/database)

  • ClickHouse adapter

Summary by CodeRabbit

  • New Features

    • Pluggable usage tracking library with adapters (database & ClickHouse), Metric model, Query builder, batch & counter logging, period/date-range queries, aggregations, multi-tenant/namespace/shared-tables support, and setup/delegation APIs.
  • Documentation

    • Added README, CONTRIBUTING guidelines, Code of Conduct, and LICENSE.
  • Tests

    • Extensive unit and integration tests covering adapters, models, query parsing, and core usage flows.
  • Chores

    • CI workflows, Docker/Docker Compose support, PHPUnit config, composer scripts/requirements (PHP >=8.0), and style preset.

✏️ Tip: You can customize this high-level summary in your review settings.

- Database adapter
- ClickHouse adapter
@coderabbitai
Copy link

coderabbitai bot commented Dec 9, 2025

Walkthrough

Adds a Utopia Usage subsystem: new public API (Adapter, SQL base adapter, ClickHouse and Database adapters, Usage façade, Metric and Query classes). Introduces SQL-to-storage utilities, deterministic IDs, multi-tenant/namespace support, and batch/counter logging methods. Adds extensive tests and a Usage test trait, phpunit config, Dockerfile and docker-compose for MariaDB/ClickHouse/usage, CI workflows (tests, linter, CodeQL), composer updates (autoload, require-dev, scripts), documentation (README, CONTRIBUTING, CODE_OF_CONDUCT, LICENSE), and Pint config.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Feat: Usage library' clearly and concisely summarizes the main purpose of the pull request—adding a new Usage library to the codebase.
Docstring Coverage ✅ Passed Docstring coverage is 83.53% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch dev

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@lohanidamodar lohanidamodar requested review from eldadfux and shimonewman and removed request for eldadfux and shimonewman December 9, 2025 03:11
@lohanidamodar lohanidamodar self-assigned this Dec 9, 2025
Copy link

@shimonewman shimonewman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

*
* @throws Exception
*/
public function log(string $metric, int $value, string $period = '1h', array $tags = []): bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use constants for the periods

'tags' => json_encode($tags),
];

if ($this->sharedTables) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we aware for the concept of shared vs not shared tables in this scope? We can completely abstract this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Abstract this where? for database library, Database already abstracts it, but for Clickhouse we don't have a separate library, either we create a clickhouse client that handles these stuff, that can be shared here and in audits as well, or we duplicate this logic both here in audit library in clickhouse adapter

$tableName = $this->getTableName();
$escapedDatabaseAndTable = $this->escapeIdentifier($this->database) . '.' . $this->escapeIdentifier($tableName);

$sql = "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how are we protecting against SQL injections here?

foreach ($metrics as $metricData) {
$period = $metricData['period'] ?? '1h';

if (! isset(self::PERIODS[$period])) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add better validation here of the structure we expect. Might even introduce a Metric class

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tests/Usage/UsageBase.php`:
- Around line 86-93: The testGetBetweenDates assertion is a no-op because
assertGreaterThanOrEqual(0, count($results)) always passes; update the assertion
to verify meaningful behavior from $this->usage->getBetweenDates('requests',
$start, $end) — e.g. assert that the results are not empty (assertNotEmpty or
assertGreaterThan(0, count($results))) and optionally assertIsArray($results)
and that at least one item contains expected keys/metrics (e.g. a 'value' or
timestamp) based on the seeded '1h' metrics so the test fails when no data is
returned.
🧹 Nitpick comments (2)
tests/Usage/UsageBase.php (2)

67-72: Potential ClickHouse flakiness due to eventual consistency.

Similar to testGetByPeriod, ClickHouse's SummingMergeTree merges rows asynchronously. The assertion on line 71 may intermittently fail if rows haven't merged yet when running against ClickHouse.

Consider adding eventual consistency handling (e.g., retry loop with timeout) for ClickHouse adapter tests, consistent with fixes applied to other tests in this trait.


150-167: Consider reducing sleep duration or documenting necessity.

The test uses sleep(2) twice, adding 4 seconds to the test suite runtime. While some delay may be necessary for time-based purge testing, consider:

  1. Reducing the sleep duration if shorter intervals suffice
  2. Adding a brief comment explaining why these sleeps are required

Comment on lines +86 to +93
public function testGetBetweenDates(): void
{
$start = DateTime::addSeconds(new \DateTime(), -3600); // 1 hour ago
$end = DateTime::now();

$results = $this->usage->getBetweenDates('requests', $start, $end);
$this->assertGreaterThanOrEqual(0, count($results));
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Assertion is effectively a no-op.

count($results) always returns a non-negative integer, so assertGreaterThanOrEqual(0, ...) will always pass. This test doesn't verify any meaningful behavior.

Consider asserting that results are actually returned given the seeded '1h' metrics:

Proposed fix
     public function testGetBetweenDates(): void
     {
         $start = DateTime::addSeconds(new \DateTime(), -3600); // 1 hour ago
         $end = DateTime::now();
 
         $results = $this->usage->getBetweenDates('requests', $start, $end);
-        $this->assertGreaterThanOrEqual(0, count($results));
+        $this->assertGreaterThan(0, count($results), 'Expected at least one result for seeded 1h metrics');
     }
🤖 Prompt for AI Agents
In `@tests/Usage/UsageBase.php` around lines 86 - 93, The testGetBetweenDates
assertion is a no-op because assertGreaterThanOrEqual(0, count($results)) always
passes; update the assertion to verify meaningful behavior from
$this->usage->getBetweenDates('requests', $start, $end) — e.g. assert that the
results are not empty (assertNotEmpty or assertGreaterThan(0, count($results)))
and optionally assertIsArray($results) and that at least one item contains
expected keys/metrics (e.g. a 'value' or timestamp) based on the seeded '1h'
metrics so the test fails when no data is returned.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/Usage/Adapter/ClickHouse.php`:
- Around line 424-427: The tenant column is declared as 'tenant
Nullable(UInt64)' while the table ORDER BY uses (tenant, id), which breaks
MergeTree since nullable keys are disallowed; change the column definition in
the columns array (where $this->sharedTables adds 'tenant Nullable(UInt64)') to
a non-nullable type with a sentinel default (e.g., 'tenant UInt64 DEFAULT 0')
and update any tenant value assignment logic elsewhere to insert 0 when no
tenant is present; alternatively, if you prefer the other approach, add
"SETTINGS allow_nullable_key = 1" to the table creation settings instead (less
recommended for performance).
- Around line 749-763: The validation block for the tenant field mistakenly
reads $metricData['$tenant'] instead of $metricData['tenant'], causing the wrong
key to be accessed; update the access in the code that assigns $tenantValue (the
variable $tenantValue assigned from $metricData) to use 'tenant' (no dollar
sign), keeping the existing checks (is_int, ctype_digit, negative check and
exception messages) intact so the non-negative integer validation on
$tenantValue works as intended.

…tamps and improve deterministic ID generation
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/Usage/Adapter/ClickHouse.php`:
- Around line 1085-1097: The case label uses the non-existent constant
Query::TYPE_SELECT causing a runtime error; update the switch to use
Query::TYPE_IN instead (replace Query::TYPE_SELECT with Query::TYPE_IN) so the
IN-value handling block (the code that calls validateAttributeName,
escapeIdentifier, formatParamValue and appends to $filters with an IN (...)
clause) matches the correct Query constant.

In `@src/Usage/Adapter/Database.php`:
- Around line 66-70: The code sets $time to a sentinel string for period 'inf'
in the log path but uses null in logBatch, causing inconsistent IDs; update both
places (the $time assignment in the log() method and the analogous assignment in
logBatch()) to use null when $period === 'inf' (i.e., set $time = $period ===
'inf' ? null : $now->format(Usage::PERIODS[$period])) and ensure any DB insert
logic that writes $time accepts null consistently so log() and logBatch()
produce the same bucket value.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/Usage/Adapter/Database.php`:
- Around line 184-189: The IN branch currently maps Query::TYPE_IN to
DatabaseQuery::contains($attribute, $values), which produces substring/element
matching; change it to use DatabaseQuery::equal($attribute, $values) so
Query::in(...) yields exact IN semantics for scalar columns: locate the switch
handling Query::TYPE_IN in Database.php and replace the call to
DatabaseQuery::contains with DatabaseQuery::equal, preserving the $attribute and
$values usage and the surrounding typing comment.

In `@src/Usage/Query.php`:
- Around line 83-93: The equal() method currently requires array $value which
breaks the established pattern; change Query::equal to accept mixed $value and
internally normalize it to an array (e.g., if not an array, wrap into [$value])
before constructing the Query, and update its implementation accordingly; then
remove the manual wrapping in the ClickHouse code that currently passes
[$metric] to Query::equal so values are not double-wrapped (look for the places
in ClickHouse where equal() is called and adjust those call sites).
🧹 Nitpick comments (6)
src/Usage/Usage.php (1)

60-63: Use constant for default period value.

The default parameter uses the string literal '1h' instead of self::PERIOD_1H. This creates an inconsistency since the constant is defined and used elsewhere.

Suggested fix
-    public function log(string $metric, int $value, string $period = '1h', array $tags = []): bool
+    public function log(string $metric, int $value, string $period = self::PERIOD_1H, array $tags = []): bool
tests/Usage/UsageBase.php (1)

150-167: Consider reducing sleep duration or using polling for faster tests.

The test uses sleep(2) twice (4 seconds total), making it slow. Consider reducing the sleep durations or implementing a retry/polling mechanism to speed up the test suite while still ensuring the purge behavior is validated.

Potential improvement
     public function testPurge(): void
     {
-        sleep(2);
+        sleep(1);

         // Add a metric
         $this->usage->log('purge-test', 999, '1h');

         // Wait a bit
-        sleep(2);
+        sleep(1);

         // Purge all metrics
         $status = $this->usage->purge(DateTime::now());
src/Usage/Adapter/ClickHouse.php (1)

424-427: Nullable tenant column with allow_nullable_key=1 is functional but suboptimal.

The tenant column is Nullable(UInt64) and used in ORDER BY with allow_nullable_key = 1 setting. While this works, using a non-nullable type with a sentinel default (e.g., UInt64 DEFAULT 0) would be more performant for ClickHouse query optimization.

Also applies to: 450-459

src/Usage/Adapter/Database.php (2)

55-59: Suppress unused parameter warning.

The $id parameter is required by the abstract SQL parent class but unused in this adapter. Add a suppression annotation to silence the static analysis warning.

Suggested fix
     /**
      * Get column definition for Database adapter (not used, but required by SQL parent)
+     *
+     * `@SuppressWarnings`(PHPMD.UnusedFormalParameter)
      */
     protected function getColumnDefinition(string $id): string
     {

270-281: Inefficient sum implementation fetches all rows.

sumByPeriod retrieves all matching documents and iterates in PHP to compute the sum. For large datasets, this is memory-intensive. Consider using database-level aggregation if Utopia\Database supports it, or document this limitation.

Consider using database sum if available
// If Utopia\Database supports sum aggregation:
// return $this->db->sum($this->collection, 'value', $dbQueries);

// Otherwise, document the limitation and consider pagination
// for very large result sets to avoid memory issues
src/Usage/Query.php (1)

5-10: Update class docblock - not just for ClickHouse.

The docblock states "Query class for ClickHouse Audit adapter" but this class is used by both ClickHouse and Database adapters. Update the description to reflect its general purpose.

Suggested fix
 /**
- * Query class for ClickHouse Audit adapter
+ * Query class for Usage adapters
  *
- * Provides a fluent interface for building ClickHouse queries.
- * Contains only methods needed for audit log operations.
+ * Provides a fluent interface for building usage metric queries.
+ * Used by all adapter implementations (Database, ClickHouse, etc.)
  */

…ods; update ClickHouse and Database adapters to support new query types and improve handling of array values
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/Usage/Adapter/ClickHouse.php`:
- Around line 1016-1157: The CodeQL issue is that Query::getValues() always
returns an array, so all is_array($values) conditional branches are dead; update
the switch handling in ClickHouse.php (cases like Query::TYPE_EQUAL,
TYPE_LESSER, TYPE_GREATER, TYPE_BETWEEN, TYPE_CONTAINS, TYPE_LESSER_EQUAL,
TYPE_GREATER_EQUAL, TYPE_ORDER_ASC/DESC) to treat $values as an array
unconditionally: remove is_array checks and else branches, iterate $values
directly (use $values[0] ?? null for single-value semantics or when selecting
first element), and simplify parameter construction (e.g., for BETWEEN use
$values[0] and $values[1] with null coalescing); keep special handling for
attribute === 'time' and preserve param naming (param_ . $paramCounter++) and
usage of formatDateTime/formatParamValue and
escapeIdentifier/validateAttributeName.

In `@src/Usage/Adapter/Database.php`:
- Around line 189-199: Add scalar type validation and matching PHPDoc for the
TYPE_LESSER_EQUAL and TYPE_GREATER_EQUAL branches: mirror the pattern used by
TYPE_GREATER, TYPE_LESSER, and TYPE_BETWEEN by adding a PHPDoc like /** `@var`
bool|float|int|string $value */ above the extraction and add a runtime guard
(e.g., if (!is_scalar($value) && !is_bool($value)) continue; or use
is_scalar($value)) before calling DatabaseQuery::lessThanEqual($attribute,
$value) and DatabaseQuery::greaterThanEqual($attribute, $value) so non-scalar
inputs are ignored and TypeErrors are avoided.
🧹 Nitpick comments (1)
src/Usage/Adapter/Database.php (1)

52-59: Silence unused parameter warning.

PHPMD flags $id as unused; consider renaming to $_id or suppressing for clarity.

Comment on lines 189 to 199
case Query::TYPE_LESSER_EQUAL:
if (!empty($values)) {
$value = $values[0];
$dbQueries[] = DatabaseQuery::lessThanEqual($attribute, $value);
}
break;
case Query::TYPE_GREATER_EQUAL:
if (!empty($values)) {
$value = $values[0];
$dbQueries[] = DatabaseQuery::greaterThanEqual($attribute, $value);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, let's find the DatabaseQuery class location
find . -type f -name "*.php" | xargs grep -l "class DatabaseQuery" | head -5

Repository: utopia-php/usage

Length of output: 42


🏁 Script executed:

#!/bin/bash
# Let's check what DatabaseQuery is imported from in the target file
head -50 src/Usage/Adapter/Database.php | grep -E "^use|^import|DatabaseQuery"

Repository: utopia-php/usage

Length of output: 339


🏁 Script executed:

#!/bin/bash
# Now let's find the Query class in vendor
find vendor -type f -path "*/Utopia/Database/Query.php" 2>/dev/null | head -3

Repository: utopia-php/usage

Length of output: 42


🏁 Script executed:

#!/bin/bash
# Let's check if vendor exists and look at the structure
ls -la vendor/ 2>/dev/null | head -20
# Also check the size of Database.php
wc -l src/Usage/Adapter/Database.php

Repository: utopia-php/usage

Length of output: 94


🏁 Script executed:

#!/bin/bash
# Read the full Database.php file to understand context
cat -n src/Usage/Adapter/Database.php

Repository: utopia-php/usage

Length of output: 14951


🏁 Script executed:

#!/bin/bash
# Let's find the Utopia\Usage\Query class
find . -type f -name "*.php" | xargs grep -l "class Query" | grep -v vendor | head -10

Repository: utopia-php/usage

Length of output: 109


🏁 Script executed:

#!/bin/bash
# Read the Query class
cat -n src/Usage/Query.php

Repository: utopia-php/usage

Length of output: 9531


🏁 Script executed:

#!/bin/bash
# Let's check for test files related to Database adapter
find tests -type f -name "*Database*" -o -name "*database*" | head -10

Repository: utopia-php/usage

Length of output: 96


🏁 Script executed:

#!/bin/bash
# Read the Database test file
wc -l tests/Usage/Adapter/DatabaseTest.php
head -100 tests/Usage/Adapter/DatabaseTest.php

Repository: utopia-php/usage

Length of output: 1521


🏁 Script executed:

#!/bin/bash
# Find UsageBase trait
find tests -type f -name "*.php" | xargs grep -l "trait UsageBase"

Repository: utopia-php/usage

Length of output: 85


🏁 Script executed:

#!/bin/bash
# Read UsageBase trait
wc -l tests/Usage/UsageBase.php
cat -n tests/Usage/UsageBase.php

Repository: utopia-php/usage

Length of output: 9432


🏁 Script executed:

#!/bin/bash
# Check git log for recent changes to Database.php or references to lessThanEqual/greaterThanEqual
git log --oneline -20 -- src/Usage/Adapter/Database.php 2>/dev/null | head -20

Repository: utopia-php/usage

Length of output: 245


🏁 Script executed:

#!/bin/bash
# Show the details of that commit
git show a89db31 --stat

Repository: utopia-php/usage

Length of output: 1743


🏁 Script executed:

#!/bin/bash
# Check the actual diff for the recent commit to see what was changed
git show a89db31 -- src/Usage/Adapter/Database.php | head -150

Repository: utopia-php/usage

Length of output: 4999


🏁 Script executed:

#!/bin/bash
# Check git status and recent commits to see if there are any related fixes or issues
git log --all --oneline | grep -i "lessThanEqual\|greaterThanEqual\|type.*guard\|scalar" | head -10

Repository: utopia-php/usage

Length of output: 245


Add type validation for lessThanEqual/greaterThanEqual to match pattern of other comparison operators.

Unlike TYPE_GREATER (line 165), TYPE_LESSER (line 172), and TYPE_BETWEEN (line 181) which include PHPDoc type hints specifying bool|float|int|string, the TYPE_LESSER_EQUAL and TYPE_GREATER_EQUAL cases lack this documentation and runtime validation. To maintain consistency and prevent TypeError when non-scalar values are passed, add explicit scalar type guards:

Suggested fix
                 case Query::TYPE_LESSER_EQUAL:
                     if (!empty($values)) {
+                        /** `@var` bool|float|int|string $value */
                         $value = $values[0];
                         $dbQueries[] = DatabaseQuery::lessThanEqual($attribute, $value);
                     }
                     break;
                 case Query::TYPE_GREATER_EQUAL:
                     if (!empty($values)) {
+                        /** `@var` bool|float|int|string $value */
                         $value = $values[0];
                         $dbQueries[] = DatabaseQuery::greaterThanEqual($attribute, $value);
                     }
                     break;
🧰 Tools
🪛 GitHub Actions: CodeQL

[error] 192-192: Parameter #2 $value of static method Utopia\Database\Query::lessThanEqual() expects bool|float|int|string, mixed given.


[error] 198-198: Parameter #2 $value of static method Utopia\Database\Query::greaterThanEqual() expects bool|float|int|string, mixed given.

🤖 Prompt for AI Agents
In `@src/Usage/Adapter/Database.php` around lines 189 - 199, Add scalar type
validation and matching PHPDoc for the TYPE_LESSER_EQUAL and TYPE_GREATER_EQUAL
branches: mirror the pattern used by TYPE_GREATER, TYPE_LESSER, and TYPE_BETWEEN
by adding a PHPDoc like /** `@var` bool|float|int|string $value */ above the
extraction and add a runtime guard (e.g., if (!is_scalar($value) &&
!is_bool($value)) continue; or use is_scalar($value)) before calling
DatabaseQuery::lessThanEqual($attribute, $value) and
DatabaseQuery::greaterThanEqual($attribute, $value) so non-scalar inputs are
ignored and TypeErrors are avoided.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/Usage/Adapter/ClickHouse.php`:
- Around line 506-526: formatDateTime currently doesn't return for types other
than null, DateTime, or string (causing a TypeError); update formatDateTime to
cover other inputs by: if $dateTime is an integer (timestamp) create a DateTime
from that timestamp and return formatted string, else throw a clear
InvalidArgumentException (or Exception) that includes the input type/value so
the method always returns a string or fails explicitly; reference the
formatDateTime method to locate where to add the integer-handling block and the
final exception.

In `@src/Usage/Adapter/Database.php`:
- Around line 229-237: The DatabaseQuery::orderDesc() calls in the getByPeriod
method's authorization-skip closure are missing the required string $attribute
argument; update those calls to pass 'time' (e.g.,
DatabaseQuery::orderDesc('time')) so ordering matches ClickHouse and tests;
apply the same fix to the other DatabaseQuery::orderDesc() invocations in this
file (the other authorization closure further down) and ensure the calls inside
convertQueriesToDatabase/getAuthorization()->skip closures consistently pass
'time'.
🧹 Nitpick comments (3)
src/Usage/Adapter/ClickHouse.php (2)

601-604: Redundant type check.

The $tags parameter has type hint array (line 581), so is_array($tags) is always true. This defensive check is harmless but unnecessary.


1141-1160: Simplify redundant is_array checks.

Query::getValues() always returns an array, making the is_array($values) checks redundant. The current structure is confusing because $singleValue is assigned conditionally but used unconditionally.

♻️ Cleaner implementation
                 case Query::TYPE_LESSER_EQUAL:
                     $this->validateAttributeName($attribute);
                     $escapedAttr = $this->escapeIdentifier($attribute);
                     $paramName = 'param_' . $paramCounter++;
+                    $singleValue = $values[0] ?? null;
                     if ($attribute === 'time') {
-                        if (is_array($values)) {
-                            /** `@var` \DateTime|string|null $singleValue */
-                            $singleValue = $values[0] ?? null;
-                        }
                         $filters[] = "{$escapedAttr} <= {{$paramName}:DateTime64(3)}";
                         $params[$paramName] = $this->formatDateTime($singleValue);
                     } else {
-                        if (is_array($values)) {
-                            /** `@var` bool|float|int|string $singleValue */
-                            $singleValue = $values[0] ?? null;
-                        }
                         $filters[] = "{$escapedAttr} <= {{$paramName}:String}";
                         $params[$paramName] = $this->formatParamValue($singleValue);
                     }
                     break;
src/Usage/Adapter/Database.php (1)

52-58: Silence PHPMD unused-parameter warning for required override.

getColumnDefinition(string $id) is required by the abstract base but not used here. Add a suppression to avoid noisy analysis warnings.

🧹 Suggested tweak
-    /**
-     * Get column definition for Database adapter (not used, but required by SQL parent)
-     */
+    /**
+     * Get column definition for Database adapter (not used, but required by SQL parent)
+     *
+     * `@SuppressWarnings`(PHPMD.UnusedFormalParameter)
+     */
     protected function getColumnDefinition(string $id): string

Comment on lines +229 to +237
public function getByPeriod(string $metric, string $period, array $queries = []): array
{
/** @var array<Document> $result */
$result = $this->db->getAuthorization()->skip(function () use ($queries, $metric, $period) {
$dbQueries = $this->convertQueriesToDatabase($queries);
$dbQueries[] = DatabaseQuery::equal('metric', [$metric]);
$dbQueries[] = DatabaseQuery::equal('period', [$period]);
$dbQueries[] = DatabaseQuery::orderDesc();

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Find the DatabaseQuery class definition
fd -t f "DatabaseQuery.php" -o -t f "*Query.php" | head -20

Repository: utopia-php/usage

Length of output: 176


🏁 Script executed:

# Search for orderDesc method definition
rg -n "function orderDesc|public.*orderDesc" -A 3

Repository: utopia-php/usage

Length of output: 505


🏁 Script executed:

# Check usage patterns of orderDesc throughout codebase
rg "orderDesc\(" -B 2 -A 1 | head -50

Repository: utopia-php/usage

Length of output: 1884


Fix missing required parameter in DatabaseQuery::orderDesc() calls.

DatabaseQuery::orderDesc() requires a string $attribute parameter, but is called without arguments. This will cause a runtime error. Pass 'time' to match the pattern in ClickHouse.php and tests:

Required fix
-            $dbQueries[] = DatabaseQuery::orderDesc();
+            $dbQueries[] = DatabaseQuery::orderDesc('time');

Also applies to: lines 247–256

🤖 Prompt for AI Agents
In `@src/Usage/Adapter/Database.php` around lines 229 - 237, The
DatabaseQuery::orderDesc() calls in the getByPeriod method's authorization-skip
closure are missing the required string $attribute argument; update those calls
to pass 'time' (e.g., DatabaseQuery::orderDesc('time')) so ordering matches
ClickHouse and tests; apply the same fix to the other DatabaseQuery::orderDesc()
invocations in this file (the other authorization closure further down) and
ensure the calls inside convertQueriesToDatabase/getAuthorization()->skip
closures consistently pass 'time'.

- Introduced logCounter method to log individual usage counter metrics with upsert behavior.
- Added logBatchCounter method to log multiple usage counter metrics in batch, allowing for individual entries without aggregation.
- Updated logBatch method to accept a batch size parameter for better control over batch processing.
- Enhanced Usage class to support new logging methods and batch size functionality.
- Implemented tests for new methods, including scenarios for batch sizes, counter behavior, and metrics with tags.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@src/Usage/Adapter/ClickHouse.php`:
- Around line 737-750: The buildInsertColumns() method appends $attribute['$id']
which is currently typed as mixed; update the loop in buildInsertColumns() to
assert and handle the type (e.g., use is_string($attribute['$id']) and either
cast to string before appending or skip/throw on invalid types) so the returned
array contains only strings and matches the declared return type; reference the
getAttributes() iteration and the '$id' key in the attribute to locate where to
add the type check and handling.
- Around line 706-717: The loop uses $attribute['$id'] without guaranteeing it's
a string before calling getColumnType; cast and validate it first: retrieve the
raw id from getAttributes() into a local variable, assert/cast it to string
(e.g., (string)$rawId) and optionally handle missing/empty ids (skip or throw)
before calling getColumnType($attrId), and then use that typed $attrId for
building $attrKey, $queryParams and $valuePlaceholders so getColumnType always
receives a string.
- Around line 850-863: The CodeQL/static-analysis warnings come from untyped
array accesses in logBatchCounter (and similarly in logBatch); after
validateMetricsBatch has run, add explicit type assertions/casts when extracting
$metricData elements in logBatchCounter (cast $metric to string, $value to
int/float as appropriate, $tags to array, $period to string/int, and $tenant to
string|null when using resolveTenantFromMetric) before passing them into
buildInsertValuesForMetric; update the extraction lines in logBatchCounter (and
the analogous lines in logBatch) to perform these casts so static analysis
recognizes the narrowed types while leaving validation logic unchanged.
🧹 Nitpick comments (4)
tests/Usage/UsageBase.php (2)

302-330: Remove unused variables $aggCount and $counterCount.

These variables are assigned but never used in assertions. Either remove them or add assertions to verify the count behavior matches expectations.

Proposed fix
         $aggSum = $this->usage->sumByPeriod('agg-vs-counter', '1h');
-        $aggCount = $this->usage->countByPeriod('agg-vs-counter', '1h');
 
         // Clear for counter test
         $this->usage->purge(DateTime::now());
 
         // Log same counter metric 3 times (last one wins)
         $this->assertTrue($this->usage->logCounter('counter-vs-agg', 10, '1h', []));
         $this->assertTrue($this->usage->logCounter('counter-vs-agg', 20, '1h', []));
         $this->assertTrue($this->usage->logCounter('counter-vs-agg', 30, '1h', []));
 
         $counterSum = $this->usage->sumByPeriod('counter-vs-agg', '1h');
-        $counterCount = $this->usage->countByPeriod('counter-vs-agg', '1h');

173-195: Assertions always pass - consider strengthening.

Both testLessThanEqualQuery and testGreaterThanEqualQuery use assertGreaterThanOrEqual(0, count($results)) which always passes. If the intent is only to verify no exceptions are thrown, consider using assertIsArray($results) instead for clarity.

src/Usage/Adapter/Database.php (2)

93-140: $batchSize parameter is unused - consider implementing or documenting.

The logBatch method ignores the $batchSize parameter and processes all metrics in a single upsert operation. While this may be acceptable for the Database adapter, it differs from the ClickHouse adapter which respects the batch size for chunked inserts. Consider either:

  1. Implementing batch chunking for consistency
  2. Adding a comment explaining why batching isn't needed for this adapter

388-407: Consider using batch deletion for better performance.

The purge method deletes documents one-by-one in a loop. For large datasets, this could be slow. Consider using a batch deletion approach if the underlying database supports it.

Comment on lines 706 to 717

// Add attributes dynamically - must include ALL attributes in schema order
foreach ($this->getAttributes() as $attribute) {
$attrId = $attribute['$id'];

$attrKey = $attrId . ($paramCounter > 0 ? '_' . $paramCounter : '');
$type = $this->getColumnType($attrId);

// Use the value from map, or null if not present
$queryParams[$attrKey] = $attributeMap[$attrId] ?? null;
$valuePlaceholders[] = '{' . $attrKey . ':' . $type . '}';
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add type assertion for $attrId before calling getColumnType.

Proposed fix
         // Add attributes dynamically - must include ALL attributes in schema order
         foreach ($this->getAttributes() as $attribute) {
-            $attrId = $attribute['$id'];
+            $attrId = $attribute['$id'];
+            if (!is_string($attrId)) {
+                continue;
+            }

             $attrKey = $attrId . ($paramCounter > 0 ? '_' . $paramCounter : '');
             $type = $this->getColumnType($attrId);
🧰 Tools
🪛 GitHub Actions: CodeQL

[error] 712-712: Parameter #1 $id of method Utopia\Usage\Adapter\ClickHouse::getColumnType() expects string, mixed given.

🤖 Prompt for AI Agents
In `@src/Usage/Adapter/ClickHouse.php` around lines 706 - 717, The loop uses
$attribute['$id'] without guaranteeing it's a string before calling
getColumnType; cast and validate it first: retrieve the raw id from
getAttributes() into a local variable, assert/cast it to string (e.g.,
(string)$rawId) and optionally handle missing/empty ids (skip or throw) before
calling getColumnType($attrId), and then use that typed $attrId for building
$attrKey, $queryParams and $valuePlaceholders so getColumnType always receives a
string.

Comment on lines 737 to 750
private function buildInsertColumns(): array
{
$insertColumns = ['id'];

foreach ($this->getAttributes() as $attribute) {
$insertColumns[] = $attribute['$id'];
}

if ($this->sharedTables) {
$insertColumns[] = 'tenant';
}

return $insertColumns;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add type assertion for attribute ID to fix return type.

$attribute['$id'] is typed as mixed, causing the return type mismatch. Add a type check.

Proposed fix
     private function buildInsertColumns(): array
     {
         $insertColumns = ['id'];

         foreach ($this->getAttributes() as $attribute) {
-            $insertColumns[] = $attribute['$id'];
+            $id = $attribute['$id'];
+            $insertColumns[] = is_string($id) ? $id : '';
         }

         if ($this->sharedTables) {
             $insertColumns[] = 'tenant';
         }

         return $insertColumns;
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
private function buildInsertColumns(): array
{
$insertColumns = ['id'];
foreach ($this->getAttributes() as $attribute) {
$insertColumns[] = $attribute['$id'];
}
if ($this->sharedTables) {
$insertColumns[] = 'tenant';
}
return $insertColumns;
}
private function buildInsertColumns(): array
{
$insertColumns = ['id'];
foreach ($this->getAttributes() as $attribute) {
$id = $attribute['$id'];
$insertColumns[] = is_string($id) ? $id : '';
}
if ($this->sharedTables) {
$insertColumns[] = 'tenant';
}
return $insertColumns;
}
🧰 Tools
🪛 GitHub Actions: CodeQL

[error] 749-749: Method Utopia\Usage\Adapter\ClickHouse::buildInsertColumns() should return array but returns array<int, mixed>.

🤖 Prompt for AI Agents
In `@src/Usage/Adapter/ClickHouse.php` around lines 737 - 750, The
buildInsertColumns() method appends $attribute['$id'] which is currently typed
as mixed; update the loop in buildInsertColumns() to assert and handle the type
(e.g., use is_string($attribute['$id']) and either cast to string before
appending or skip/throw on invalid types) so the returned array contains only
strings and matches the declared return type; reference the getAttributes()
iteration and the '$id' key in the attribute to locate where to add the type
check and handling.

Comment on lines 850 to 863
foreach ($metricsBatch as $metricData) {
$period = $metricData['period'] ?? Usage::PERIOD_1H;
$metric = $metricData['metric'];
$value = $metricData['value'];
$tags = (array) ($metricData['tags'] ?? []);

// Build values for this metric
$tenant = $this->sharedTables ? $this->resolveTenantFromMetric($metricData) : null;
$result = $this->buildInsertValuesForMetric($metric, $value, $period, $tags, $tenant, $paramCounter);

$queryParams = array_merge($queryParams, $result['queryParams']);
$valueClauses[] = '(' . implode(', ', $result['valuePlaceholders']) . ')';
$paramCounter++;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add type assertions to satisfy static analysis after validation.

The batch processing in logBatchCounter extracts values from $metricData without type assertions, causing CodeQL errors. Since validateMetricsBatch already validates the types, add explicit casts to narrow the types for static analysis.

Proposed fix
             foreach ($metricsBatch as $metricData) {
                 $period = $metricData['period'] ?? Usage::PERIOD_1H;
                 $metric = $metricData['metric'];
                 $value = $metricData['value'];
                 $tags = (array) ($metricData['tags'] ?? []);
+                
+                // Type assertions (validated by validateMetricsBatch)
+                $metric = is_string($metric) ? $metric : '';
+                $value = is_int($value) ? $value : 0;
+                $period = is_string($period) ? $period : Usage::PERIOD_1H;

                 // Build values for this metric
                 $tenant = $this->sharedTables ? $this->resolveTenantFromMetric($metricData) : null;
                 $result = $this->buildInsertValuesForMetric($metric, $value, $period, $tags, $tenant, $paramCounter);

Also applies to logBatch at lines 966-974.

🧰 Tools
🪛 GitHub Actions: CodeQL

[error] 858-858: Parameter #1 $metric of method Utopia\Usage\Adapter\ClickHouse::buildInsertValuesForMetric() expects string, mixed given.


[error] 858-858: Parameter #2 $value of method Utopia\Usage\Adapter\ClickHouse::buildInsertValuesForMetric() expects int, mixed given.


[error] 858-858: Parameter #3 $period of method Utopia\Usage\Adapter\ClickHouse::buildInsertValuesForMetric() expects string, mixed given.


[error] 858-858: Parameter #4 $tags of method Utopia\Usage\Adapter\ClickHouse::buildInsertValuesForMetric() expects array<string, mixed>, array given.

🤖 Prompt for AI Agents
In `@src/Usage/Adapter/ClickHouse.php` around lines 850 - 863, The
CodeQL/static-analysis warnings come from untyped array accesses in
logBatchCounter (and similarly in logBatch); after validateMetricsBatch has run,
add explicit type assertions/casts when extracting $metricData elements in
logBatchCounter (cast $metric to string, $value to int/float as appropriate,
$tags to array, $period to string/int, and $tenant to string|null when using
resolveTenantFromMetric) before passing them into buildInsertValuesForMetric;
update the extraction lines in logBatchCounter (and the analogous lines in
logBatch) to perform these casts so static analysis recognizes the narrowed
types while leaving validation logic unchanged.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/Usage/Adapter/ClickHouse.php`:
- Around line 683-688: When PERIOD_INF is used the code sets $time to null which
yields an empty string param and lets ClickHouse coerce to 1970 implicitly;
instead set a deterministic sentinel datetime for $timestamp when
Usage::PERIOD_INF is selected. Update the block around $now/$time/$timestamp
(used with Usage::PERIODS and formatDateTime) to assign a concrete sentinel
string (e.g. '1970-01-01 00:00:00.000') and pass that through
$this->formatDateTime so $timestamp is never null, ensuring partition key
behavior is explicit and deterministic.

Comment on lines 683 to 688
$now = new \DateTime();
$time = $period === Usage::PERIOD_INF
? null
: $now->format(Usage::PERIODS[$period]);
$timestamp = $time !== null ? $this->formatDateTime($time) : null;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd /tmp && find . -type f -name "ClickHouse.php" 2>/dev/null | head -5

Repository: utopia-php/usage

Length of output: 42


🏁 Script executed:

# First, let's locate the file
find . -type f -name "ClickHouse.php" -path "*/Usage/Adapter/*" 2>/dev/null

Repository: utopia-php/usage

Length of output: 94


🏁 Script executed:

# Let's search from the repo root
git ls-files | grep -i clickhouse

Repository: utopia-php/usage

Length of output: 131


🏁 Script executed:

# Read the relevant file section
head -n 700 src/Usage/Adapter/ClickHouse.php | tail -n 50

Repository: utopia-php/usage

Length of output: 1626


🏁 Script executed:

# Get more context around lines 683-688
sed -n '670,700p' src/Usage/Adapter/ClickHouse.php

Repository: utopia-php/usage

Length of output: 978


🏁 Script executed:

# Search for table schema or CREATE TABLE statement
rg -i "CREATE TABLE|table_schema" src/Usage/Adapter/ClickHouse.php | head -20

Repository: utopia-php/usage

Length of output: 202


🏁 Script executed:

# Search for the column mapping/attribute map where timestamp is used
sed -n '680,750p' src/Usage/Adapter/ClickHouse.php

Repository: utopia-php/usage

Length of output: 2342


🏁 Script executed:

# Search for how attributes/columns are mapped in the insert
rg -A 30 "attributeMap.*=" src/Usage/Adapter/ClickHouse.php | head -60

Repository: utopia-php/usage

Length of output: 1246


🏁 Script executed:

# Find the complete CREATE TABLE statement
rg -B 5 -A 50 "CREATE TABLE IF NOT EXISTS" src/Usage/Adapter/ClickHouse.php | head -150

Repository: utopia-php/usage

Length of output: 2819


🏁 Script executed:

# Also check if there are any migration files or schema definitions
find . -type f \( -name "*.php" -o -name "*.sql" \) | xargs grep -l "time.*DateTime" 2>/dev/null | grep -i schema

Repository: utopia-php/usage

Length of output: 42


🏁 Script executed:

# Search for column type definitions or schema initialization
rg "time.*DateTime|PARTITION BY" src/Usage/Adapter/ClickHouse.php

Repository: utopia-php/usage

Length of output: 438


🏁 Script executed:

# Search for tests related to PERIOD_INF or this code path
rg -i "PERIOD_INF|period_inf" tests/ src/

Repository: utopia-php/usage

Length of output: 263


🏁 Script executed:

# Check if there's error handling around NULL inserts
rg -B 3 -A 3 "try|catch|Exception.*time|Exception.*NULL" src/Usage/Adapter/ClickHouse.php | head -50

Repository: utopia-php/usage

Length of output: 1853


🏁 Script executed:

# Search for how the query is actually executed with parameters
rg -B 5 -A 10 "query.*INSERT.*valuePlaceholders|execute.*INSERT" src/Usage/Adapter/ClickHouse.php | head -80

Repository: utopia-php/usage

Length of output: 42


🏁 Script executed:

# Search for formatParamValue function
rg -A 20 "function formatParamValue|formatParamValue.*=" src/Usage/Adapter/ClickHouse.php

Repository: utopia-php/usage

Length of output: 571


🏁 Script executed:

# Also check the full picture - how parameters are formatted in the HTTP request
rg -B 10 -A 15 "formatParamValue" src/Usage/Adapter/ClickHouse.php | head -80

Repository: utopia-php/usage

Length of output: 3381


🏁 Script executed:

# Check the Usage.php file for PERIODS mapping
rg -B 3 -A 10 "PERIODS.*=" src/Usage/Usage.php

Repository: utopia-php/usage

Length of output: 394


🏁 Script executed:

# Search for getColumnType function
rg -A 30 "function getColumnType" src/Usage/Adapter/ClickHouse.php

Repository: utopia-php/usage

Length of output: 1070


🏁 Script executed:

# Also check how 'time' is defined as an attribute
rg -B 5 -A 15 "getAttributes.*=|'time'.*DateTime" src/Usage/Adapter/ClickHouse.php | head -60

Repository: utopia-php/usage

Length of output: 42


🏁 Script executed:

# Search for where 'time' attribute is defined or set
rg -B 5 -A 5 "'time'" src/Usage/Adapter/ClickHouse.php | grep -A 5 -B 5 "required\|type"

Repository: utopia-php/usage

Length of output: 42


🏁 Script executed:

# Check the attributes array or getAttributes implementation
rg -B 10 -A 30 "function getAttributes|protected.*attributes" src/Usage/Adapter/ClickHouse.php | head -100

Repository: utopia-php/usage

Length of output: 42


🏁 Script executed:

# Look for attribute initialization in parent class or base
fd "Adapter.php" src/ --type f

Repository: utopia-php/usage

Length of output: 81


🏁 Script executed:

# Check the base Adapter class
cat -n src/Usage/Adapter.php

Repository: utopia-php/usage

Length of output: 3630


🏁 Script executed:

# Search for 'time' attribute definition specifically
rg -B 3 -A 3 "'time'" src/Usage/Adapter.php

Repository: utopia-php/usage

Length of output: 42


🏁 Script executed:

# Search for how attributes are defined - looking for 'metric', 'value', 'period', 'time', 'tags' 
rg -B 5 -A 20 "metric.*value.*period|'metric'|'period'|'tags'" src/Usage/Adapter/ClickHouse.php | head -100

Repository: utopia-php/usage

Length of output: 3660


🏁 Script executed:

# Look at how 'metric', 'value', etc. are mapped to columns
rg -B 3 -A 3 "getAttribute.*time|'time'.*required" src/Usage/Adapter/ClickHouse.php

Repository: utopia-php/usage

Length of output: 42


🏁 Script executed:

# Search for where the schema attributes are hardcoded or predefined
rg -B 5 -A 20 "buildInsertColumns|columnDefs.*=" src/Usage/Adapter/ClickHouse.php | head -80

Repository: utopia-php/usage

Length of output: 2698


🏁 Script executed:

# Search for getAttributes implementation
rg -A 30 "function getAttributes" src/Usage/Adapter/ClickHouse.php

Repository: utopia-php/usage

Length of output: 42


🏁 Script executed:

# Check if 'time', 'metric', 'value', 'period', 'tags' are hardcoded or predefined
rg -B 10 "columns.*metric.*value|columns.*=.*\[\]" src/Usage/Adapter/ClickHouse.php | head -80

Repository: utopia-php/usage

Length of output: 294


🏁 Script executed:

# Look for where columns array is built and check if time is always included
rg -B 20 "time.*DateTime64|columns\[\].*time" src/Usage/Adapter/ClickHouse.php | head -100

Repository: utopia-php/usage

Length of output: 2718


🌐 Web query:

ClickHouse DateTime64 non-nullable column empty string parameter NULL behavior insert

💡 Result:

If you insert an empty string ('') into a non-nullable DateTime64 column, ClickHouse does not treat that as NULL. What happens depends on how you’re inserting (format/settings):

  • CSV input: empty fields are treated as the column default value when input_format_csv_empty_as_default=1 (this is true by default)—so an empty field for DateTime64 becomes its default (typically the “zero” datetime) rather than NULL. [2]
  • JSONEachRow input: empty strings are only replaced with defaults if you enable input_format_json_empty_as_default=1. Otherwise, the empty string is just a string and parsing can fail. [3]
  • Actual NULL into a non-nullable column throws an error unless you enable input_format_null_as_default=1, in which case NULL is replaced by the column’s default. [1][6]

How to handle “empty string means missing” safely

Option A (recommended): make the column nullable

ts Nullable(DateTime64(3))

Then send NULL when you mean missing.

Option B: convert empty string to NULL, then choose a fallback

INSERT INTO t (ts)
VALUES (coalesce(toDateTime64OrNull(nullif({p:String}, ''), 3), toDateTime64('1970-01-01 00:00:00', 3)));

toDateTime64OrNull returns NULL on invalid/empty input. [4]

Option C: convert empty/invalid directly to the default

INSERT INTO t (ts)
VALUES (toDateTime64OrDefault({p:String}, 3));

toDateTime64OrDefault returns the provided default, or 1970-01-01 00:00:00.000 if you don’t provide one. [5]

Sources: [1] Format setting input_format_null_as_default [2] CSV input_format_csv_empty_as_default [3] JSON input_format_json_empty_as_default [4] toDateTime64OrNull [5] toDateTime64OrDefault [6] non-nullable NULL insert error


Set a deterministic timestamp for PERIOD_INF instead of relying on implicit ClickHouse coercion.

The time column is non-nullable DateTime64(3) and used as a partition key. When PERIOD_INF is used, $timestamp is set to null, which becomes an empty string when sent as a parameter. ClickHouse then silently coerces the empty string to its default datetime (1970-01-01 00:00:00), making the behavior implicit and fragile.

Use an explicit sentinel timestamp (e.g., '1970-01-01 00:00:00.000') or reconsider the column design:

Proposed fix (sentinel timestamp)
-        $time = $period === Usage::PERIOD_INF
-            ? null
-            : $now->format(Usage::PERIODS[$period]);
-        $timestamp = $time !== null ? $this->formatDateTime($time) : null;
+        $time = $period === Usage::PERIOD_INF
+            ? '1970-01-01 00:00:00.000'
+            : $now->format(Usage::PERIODS[$period]);
+        $timestamp = $this->formatDateTime($time);
🤖 Prompt for AI Agents
In `@src/Usage/Adapter/ClickHouse.php` around lines 683 - 688, When PERIOD_INF is
used the code sets $time to null which yields an empty string param and lets
ClickHouse coerce to 1970 implicitly; instead set a deterministic sentinel
datetime for $timestamp when Usage::PERIOD_INF is selected. Update the block
around $now/$time/$timestamp (used with Usage::PERIODS and formatDateTime) to
assign a concrete sentinel string (e.g. '1970-01-01 00:00:00.000') and pass that
through $this->formatDateTime so $timestamp is never null, ensuring partition
key behavior is explicit and deterministic.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/Usage/Adapter/Database.php`:
- Around line 482-486: The setSharedTables method currently calls itself causing
infinite recursion; change the internal call from
$this->setSharedTables($sharedTables) to call the underlying database adapter
$this->db->setSharedTables($sharedTables) (ensure $this->db is the correct
property and initialized) and then return $this to preserve fluent interface in
the setSharedTables method.
🧹 Nitpick comments (3)
src/Usage/Adapter/Database.php (3)

93-93: Unused $batchSize parameter.

The $batchSize parameter is declared but never used. Either implement batching logic to process documents in chunks of $batchSize, or remove the parameter to avoid confusion.

Option 1: Implement batching
     public function logBatch(array $metrics, int $batchSize = 1000): bool
     {
         $this->db->getAuthorization()->skip(function () use ($metrics, $batchSize) {
             $documentsById = [];
             // ... existing aggregation logic ...
 
-            if (!empty($documents)) {
-                $this->db->upsertDocumentsWithIncrease($this->collection, 'value', $documents);
-            }
+            if (!empty($documents)) {
+                foreach (array_chunk($documents, $batchSize) as $batch) {
+                    $this->db->upsertDocumentsWithIncrease($this->collection, 'value', $batch);
+                }
+            }
         });

188-188: Unused $batchSize parameter (same as logBatch).

Apply the same batching fix as suggested for logBatch(), or remove the parameter.


388-407: Purge implementation is functional but could be optimized.

The current approach deletes documents one-by-one in a loop. While this works correctly, it may be slow for large datasets. Consider using a bulk delete operation if available in the Utopia\Database library.

Comment on lines +482 to +486
public function setSharedTables(bool $sharedTables): self
{
$this->setSharedTables($sharedTables);
return $this;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Infinite recursion in setSharedTables().

Line 484 calls $this->setSharedTables($sharedTables) instead of $this->db->setSharedTables($sharedTables), causing a stack overflow at runtime.

Required fix
     public function setSharedTables(bool $sharedTables): self
     {
-        $this->setSharedTables($sharedTables);
+        $this->db->setSharedTables($sharedTables);
         return $this;
     }
🤖 Prompt for AI Agents
In `@src/Usage/Adapter/Database.php` around lines 482 - 486, The setSharedTables
method currently calls itself causing infinite recursion; change the internal
call from $this->setSharedTables($sharedTables) to call the underlying database
adapter $this->db->setSharedTables($sharedTables) (ensure $this->db is the
correct property and initialized) and then return $this to preserve fluent
interface in the setSharedTables method.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/Usage/Adapter/ClickHouse.php`:
- Around line 912-915: When Usage::PERIOD_INF is detected the code sets
$timestamp to null which relies on ClickHouse coercion into the non-nullable
DateTime64(3) time column; change the branch that sets $timestamp so it uses an
explicit sentinel timestamp (for example the epoch or another agreed sentinel)
formatted via the existing formatDateTime() helper instead of null so the
$timestamp variable written to the time column is deterministic (refer to
Usage::PERIOD_INF, $timestamp, formatDateTime(), and the time column
DateTime64(3)).
- Line 896: The prepareMetricRow function is declared to return ?array but
always returns an array; update its signature to return array (remove the
nullable) and adjust any PHPDoc or type hints that reference ?array to array;
specifically change the method declaration for prepareMetricRow(array
$metricData): ?array to prepareMetricRow(array $metricData): array and ensure
callers and static analyzers remain consistent with the non-nullable return
type.
🧹 Nitpick comments (2)
src/Usage/Adapter/ClickHouse.php (2)

1243-1256: Simplify value extraction — $singleValue only assigned conditionally but used unconditionally.

Query::getValues() always returns an array, making is_array($values) always true. However, $singleValue is only assigned inside the if (is_array($values)) block but used unconditionally at line 1248. This creates fragile code structure.

♻️ Proposed refactor
                 case Query::TYPE_LESSER_EQUAL:
                     $this->validateAttributeName($attribute);
                     $escapedAttr = $this->escapeIdentifier($attribute);
                     $paramName = 'param_' . $paramCounter++;
+                    $singleValue = $values[0] ?? null;
                     if ($attribute === 'time') {
-                        if (is_array($values)) {
-                            /** `@var` \DateTime|string|null $singleValue */
-                            $singleValue = $values[0] ?? null;
-                        }
                         $filters[] = "{$escapedAttr} <= {{$paramName}:DateTime64(3)}";
                         $params[$paramName] = $this->formatDateTime($singleValue);
                     } else {
-                        if (is_array($values)) {
-                            /** `@var` bool|float|int|string $singleValue */
-                            $singleValue = $values[0] ?? null;
-                        }
                         $filters[] = "{$escapedAttr} <= {{$paramName}:String}";
                         $params[$paramName] = $this->formatParamValue($singleValue);
                     }
                     break;

1264-1277: Same refactor needed for TYPE_GREATER_EQUAL.

Same issue as TYPE_LESSER_EQUAL$singleValue is conditionally assigned but unconditionally used.

♻️ Proposed refactor
                 case Query::TYPE_GREATER_EQUAL:
                     $this->validateAttributeName($attribute);
                     $escapedAttr = $this->escapeIdentifier($attribute);
                     $paramName = 'param_' . $paramCounter++;
+                    $singleValue = $values[0] ?? null;
                     if ($attribute === 'time') {
-                        if (is_array($values)) {
-                            /** `@var` \DateTime|string|null $singleValue */
-                            $singleValue = $values[0] ?? null;
-                        }
                         $filters[] = "{$escapedAttr} >= {{$paramName}:DateTime64(3)}";
                         $params[$paramName] = $this->formatDateTime($singleValue);
                     } else {
-                        if (is_array($values)) {
-                            /** `@var` bool|float|int|string $singleValue */
-                            $singleValue = $values[0] ?? null;
-                        }
                         $filters[] = "{$escapedAttr} >= {{$paramName}:String}";
                         $params[$paramName] = $this->formatParamValue($singleValue);
                     }
                     break;

Comment on lines +912 to +915
$time = $period === Usage::PERIOD_INF
? null
: $now->format(Usage::PERIODS[$period]);
$timestamp = $time !== null ? $this->formatDateTime($time) : null;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Null timestamp for PERIOD_INF relies on implicit ClickHouse coercion.

When period === Usage::PERIOD_INF, $timestamp becomes null (line 915), which is then assigned to the time column (line 929). Since time is a non-nullable DateTime64(3) column defined at line 489, ClickHouse will silently coerce the null/empty value to 1970-01-01 00:00:00, making the behavior implicit and fragile.

Consider using an explicit sentinel timestamp for deterministic behavior.

🐛 Proposed fix (explicit sentinel)
         // Period-aligned time
         $now = new \DateTime();
         $time = $period === Usage::PERIOD_INF
-            ? null
+            ? '1970-01-01 00:00:00.000'
             : $now->format(Usage::PERIODS[$period]);
-        $timestamp = $time !== null ? $this->formatDateTime($time) : null;
+        $timestamp = $this->formatDateTime($time);

Also applies to: 929-929

🤖 Prompt for AI Agents
In `@src/Usage/Adapter/ClickHouse.php` around lines 912 - 915, When
Usage::PERIOD_INF is detected the code sets $timestamp to null which relies on
ClickHouse coercion into the non-nullable DateTime64(3) time column; change the
branch that sets $timestamp so it uses an explicit sentinel timestamp (for
example the epoch or another agreed sentinel) formatted via the existing
formatDateTime() helper instead of null so the $timestamp variable written to
the time column is deterministic (refer to Usage::PERIOD_INF, $timestamp,
formatDateTime(), and the time column DateTime64(3)).

…s, update parsing logic, and add comprehensive tests.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@src/Usage/Adapter/ClickHouse.php`:
- Around line 1078-1085: In the count() method after $result =
$this->query($sql, $params) and $json = json_decode($result, true), guard the
decoded value before indexing: verify that $json is an array, that
isset($json['data']) and is_array($json['data']), that isset($json['data'][0])
and is_array($json['data'][0]), and that isset($json['data'][0]['total']) before
casting and returning it; if any check fails return 0. Use the existing symbols
query(), $result, $json and the count() method to locate where to add these
checks.
- Around line 1559-1566: The JSON decode in sumByPeriod() is used without
guarding against failures or non-array shapes; update the code around $result =
$this->query($sql, $parsed['params']); and $json = json_decode($result, true) to
validate that $json is an array, contains a 'data' key that is an array, that
index 0 exists, and that 'grand_total' exists before indexing; if any check
fails return 0 (optionally also handle json_decode errors via json_last_error()
for clarity). Ensure you reference the existing query() call and the $json
variable so the checks are added right before accessing
$json['data'][0]['grand_total'].
- Around line 1330-1372: In parseResults(), guard against json_decode and mixed
types: verify $json is an array and that isset($json['data']) yields an array
before iterating (e.g., ensure $rows is an array), handle json_decode($result,
true) possibly returning null; for each $row ensure it is an array before
foreach; when handling 'tenant' cast only if is_numeric($value) or not null;
when handling 'time' ensure $value is a string before calling strpos/str_replace
and otherwise leave as null or format-safe fallback; when handling 'tags' check
is_string($value) and that json_decode($value, true) returns an array (use
is_array() fallback to empty array) and otherwise preserve arrays/objects; and
generally use is_array/is_string checks to satisfy PHPStan for mixed-offset and
type errors.

In `@tests/Usage/Adapter/ClickHouseTest.php`:
- Around line 319-326: Remove the unused local variables $now and $oldTime from
the test — delete the lines that assign $now = DateTime::now(); and $oldTime =
(new \DateTime())->modify('-2 hours'); and update or retain only the explanatory
comments around the calls to $this->usage->log('metric-A', ...),
$this->usage->log('metric-B', ...), and the comment about relying on creation
times; ensure no other code references $now or $oldTime so PHPMD warnings go
away.

Comment on lines +319 to +326
$now = DateTime::now();
// metric A: value 10, time NOW
$this->usage->log('metric-A', 10, '1h', ['category' => 'cat1']);
// metric B: value 20, time NOW
$this->usage->log('metric-B', 20, '1h', ['category' => 'cat2']);
// metric C: value 30, time NOW - 2 hours
$oldTime = (new \DateTime())->modify('-2 hours');
// We can't easily force time in log(), so we just rely on metrics created now being "newer" than this timestamp
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove unused locals flagged by PHPMD.

$now and $oldTime are unused; drop them to keep the test clean.

🛠️ Proposed fix
-        $now = DateTime::now();
-        // metric A: value 10, time NOW
+        // metric A: value 10, time NOW
         $this->usage->log('metric-A', 10, '1h', ['category' => 'cat1']);
         // metric B: value 20, time NOW
         $this->usage->log('metric-B', 20, '1h', ['category' => 'cat2']);
-        // metric C: value 30, time NOW - 2 hours
-        $oldTime = (new \DateTime())->modify('-2 hours');
-        // We can't easily force time in log(), so we just rely on metrics created now being "newer" than this timestamp
+        // We can't easily force time in log(), so we just rely on metrics created now being "newer" than this timestamp
🧰 Tools
🪛 PHPMD (2.15.0)

[warning] 319-319: Avoid unused local variables such as '$now'. (undefined)

(UnusedLocalVariable)


[warning] 325-325: Avoid unused local variables such as '$oldTime'. (undefined)

(UnusedLocalVariable)

🤖 Prompt for AI Agents
In `@tests/Usage/Adapter/ClickHouseTest.php` around lines 319 - 326, Remove the
unused local variables $now and $oldTime from the test — delete the lines that
assign $now = DateTime::now(); and $oldTime = (new \DateTime())->modify('-2
hours'); and update or retain only the explanatory comments around the calls to
$this->usage->log('metric-A', ...), $this->usage->log('metric-B', ...), and the
comment about relying on creation times; ensure no other code references $now or
$oldTime so PHPMD warnings go away.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants